Skip to content

CDRIVER-5815 support disabling socket timeouts#2185

Closed
connorsmacd wants to merge 50 commits intomongodb:masterfrom
connorsmacd:sockettimeoutms_0.CDRIVER-5815
Closed

CDRIVER-5815 support disabling socket timeouts#2185
connorsmacd wants to merge 50 commits intomongodb:masterfrom
connorsmacd:sockettimeoutms_0.CDRIVER-5815

Conversation

@connorsmacd
Copy link
Collaborator

@connorsmacd connorsmacd commented Dec 4, 2025

Reference

CDRIVER-5815
CDRIVER-6177

Summary

This PR makes it possible to disable socket timeouts via socketTimeoutMS.

According to the spec, socketTimeoutMS=0 shall be used to disable socket timeouts. However, the C Driver does not comply with this. Since URI option values of 0 are interpreted as unset (see CDRIVER-3730), the default timeout value is used instead. Normally, this would be okay since the spec dictates that the default timeout should be "no timeout", but the C Driver is yet again not spec-compliant as it uses a default of 5 minutes.

To avoid a breaking change, this PR introduces an alternate string option "inf" that can be used to disable the socket timeout. This is a stopgap measure that should be reevaluated for the next major release.

The C Driver also treated timeout values of 0 as "immediate timeouts" just like the POSIX system call poll. Since this is not aligned with the spec, the C Driver was changed to internally treat timeout values of 0 as infinite timeouts. However, there are some parts of the C Driver that were intentionally using immediate timeouts for non-blocking behavior, so a new sentinel MONGOC_SOCKET_TIMEOUT_IMMEDIATE is introduced, which currently is set to the value of INT32_MIN.

Caution

This makes specifying special timeout values very confusing since there are different interpretations depending on the context and representation:

URI Spec socketTimeoutMS C Driver URI socketTimeoutMS C Driver Internal INT32 Representation POSIX INT32 Representation (e.g., the `timeout` arg for `poll`)
Immediate timeout cannot specify cannot specify -2147483648 (INT32_MIN) 0
Infinite/disabled timeout 0 "inf" 0 normally, but technically [-2147483647, 0] due to CDRIVER-4781 <0

@connorsmacd connorsmacd changed the title CDRIVER-5815 treat socketTimeoutMS=0 as infinite timeout CDRIVER-5815 support disabling socket timeouts Dec 17, 2025
@connorsmacd connorsmacd marked this pull request as ready for review December 17, 2025 20:33
@connorsmacd connorsmacd requested a review from a team as a code owner December 17, 2025 20:33
@kevinAlbs kevinAlbs requested a review from eramongodb December 17, 2025 20:35
Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am excited to see the further reduction of explicit timeout arithmetic throughout the codebase with the new _mongoc_stream_tls_timer_* helpers.

Just noting an observation:

int64_t timeout_ms = 0; // -> inf
timeout_ms = _mongoc_stream_tls_timer_to_timeout_msec(_mongoc_stream_tls_timer_from_timeout_msec(timeout_ms));
ASSERT_CMPINT32(timeout_ms, ==, MONGOC_SOCKET_TIMEOUT_INFINITE);

} else if (timeout_msec == 0) {
if (timeout_msec == MONGOC_SOCKET_TIMEOUT_IMMEDIATE) {
return 0;
} else if (timeout_msec <= 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further look, this might be further complicated by streams wrapping streams. For example, a TLS stream might pass a timeout to an underlying base stream:

mongoc_stream_t *base_stream = get_stream();

// Wrap in a TLS stream:
mongoc_ssl_opt_t ssl_opt = {0};
mongoc_stream_t *tls_stream = mongoc_stream_tls_new_with_hostname(base_stream, "localhost", &ssl_opt, 1 /* client */);
ASSERT(tls_stream);

int32_t timeout_ms = 0; // stream convention: 0 means immediate.
size_t ret = mongoc_stream_write(tls_stream, "foo", 3, timeout_ms);
// mongoc_stream_write converts to socket convention: 0 => MONGOC_SOCKET_TIMEOUT_IMMEDIATE
// mongoc_stream_write is called on base_stream with socket convention (but should be stream convention)

It might be confusing to track if timeout_msec is in "socket convention" or "stream convention". This might point to a need for a larger refactor of how timeout representation (a new internal mongoc_socket_timeout_t type?). But for the scope of "socketTimeoutMS=inf", I would rather err towards a simpler change.

Admittedly, I do not see an obvious simplification. Possibly: can the internal representation of the timeout remain the same? Instead, add internal-only stream API for infinite timeout:

// mongoc_stream_writev is public API.
ssize_t
mongoc_stream_writev(mongoc_stream_t *stream, mongoc_iovec_t *iov, size_t iovcnt, int32_t timeout_msec)
{
   // CDRIVER-4781: for backward compatibility.
   if (timeout_msec < 0) {
      timeout_msec = MONGOC_DEFAULT_TIMEOUT_MSEC;
   }
   return mongoc_stream_writev_impl(stream, iov, iovcnt, timeout_msec);
}

// mongoc_stream_writev_notimeout is internal-only.
ssize_t
mongoc_stream_writev_notimeout(mongoc_stream_t *stream, mongoc_iovec_t *iov, size_t iovcnt)
{
   int32_t timeout_msec = -1; // Infinite.
   return mongoc_stream_writev_impl(stream, iov, iovcnt, timeout_msec);
}

} else if (timeout_msec == 0) {
if (timeout_msec == MONGOC_SOCKET_TIMEOUT_IMMEDIATE) {
return 0;
} else if (timeout_msec <= 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extending debug_stream_t can test how socketTimeoutMS is passed through to streams set with mongoc_client_set_stream_initiator. See this test for an example.

socketTimeoutMS=INT32_MIN may be a problem. It changes to 0 resulting in immediate timeout:

ASSERT_OR_PRINT(ok, error);                               // After PR: error!
ASSERT_CMPINT32(stats.last_timeout_readv, ==, INT32_MIN); // After PR: 0
ASSERT_CMPINT32(stats.last_timeout_writev, ==, 3600000);  // After PR: 0

IMO: I think "inf" is a convenience more than a necessity. I would rather avoid adding much complexity and avoid any possibly breaking changes.

If "inf" still seems worth pursuing, consider separating refactors into a new PR (e.g. adding a socket timeout convention + changing the TLS timeout handling). That might help separate what changes are needed for "inf" from changes that are intended as improvements to existing timeout handling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants